An Efficient Character Segmentation Algorithm for Printed Chinese Documents
Abstract
Character segmentation for printed documents is used in many fields. This paper proposes an efficient character segmentation algorithm for printed Chinese documents that is suited to paper watermarking systems. The algorithm consists of three main steps: connected-region recognition, connected-region merging, and fine-grained segmentation. Together, these steps achieve high segmentation accuracy and highly consistent segmentation between the digital version and the print-scanned version of images of the same document. Experiments demonstrate the effectiveness of the proposed algorithm.
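The abstract names the three steps but gives no criteria for them, so the following is only a minimal sketch of what such a pipeline could look like, not the paper's method. It assumes a binarized image of a single horizontal text line, stored as a list of rows with 1 for ink and 0 for background. The function names, the `max_gap` and `max_width` parameters, the gap-based merge rule and the minimum-ink split rule are all illustrative assumptions.

```python
from collections import deque


def connected_regions(img):
    """Step 1 (sketch): bounding boxes (top, left, bottom, right), inclusive,
    of the 8-connected ink regions in a binary image."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def merge_regions(boxes, max_gap=1):
    """Step 2 (sketch, assumed rule): merge boxes on one text line when at most
    max_gap blank columns separate them, e.g. the parts of a left-right character."""
    merged = []
    for box in sorted(boxes, key=lambda b: b[1]):
        if merged and box[1] - merged[-1][3] - 1 <= max_gap:
            t, l, b, r = merged[-1]
            merged[-1] = (min(t, box[0]), l, max(b, box[2]), max(r, box[3]))
        else:
            merged.append(box)
    return merged


def split_wide(img, box, max_width):
    """Step 3 (sketch, assumed rule): split a box wider than max_width at the
    interior column with the least ink; the cut column goes to the right part."""
    top, left, bottom, right = box
    width = right - left + 1
    if width <= max_width or width < 3:
        return [box]

    def ink(x):
        return sum(img[y][x] for y in range(top, bottom + 1))

    cut = min(range(left + 1, right), key=ink)
    return (split_wide(img, (top, left, bottom, cut - 1), max_width)
            + split_wide(img, (top, cut, bottom, right), max_width))


def segment_line(img, max_gap=1, max_width=3):
    """Run the three sketched steps in order on one text-line image."""
    result = []
    for box in merge_regions(connected_regions(img), max_gap):
        result.extend(split_wide(img, box, max_width))
    return result
```

Calling `segment_line(img)` on a line image returns one bounding box per estimated character. In practice the two thresholds would likely be derived from the estimated character pitch of the document rather than fixed, and a multi-line page would first need to be cut into text lines.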
Similar resources
A Chinese Character Segmentation Algorithm for Complicated Printed Documents
Character segmentation for printed documents plays an important role in optical character recognition, ticket information identification, postal code identification, automatic license plate recognition, and so on. In this paper, a Chinese character segmentation algorithm for complicated printed documents is proposed for use in paper watermarking systems. In this applic...
A Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling
In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...
Recognition-based handwritten Chinese character segmentation using a probabilistic Viterbi algorithm
This paper presents a recognition-based character segmentation method for handwritten Chinese characters. Possible non-linear segmentation paths are initially located using a probabilistic Viterbi algorithm. Candidate segmentation paths are determined by verifying overlapping paths, between-character gaps, and adjacent-path distances. A segmentation graph is then constructed using candidate pat...
Towards Unified Chinese Segmentation Algorithm
As Chinese is an ideographic character-based language, the words in the texts are not delimited by spaces. Indexing of Chinese documents is impossible without a proper segmentation algorithm. Many Chinese segmentation algorithms have been proposed in the past. Traditional segmentation algorithms cannot operate without a large dictionary or a large corpus of training data. Nowadays, the Web has ...
Publication date: 2013